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Most data based state and parameter estimation methods require suitable initial values or guesses to achieve 
convergence to the desired solution, which typically is a global minimum of some cost function. Unfortunately, 
however, other stable solutions (e.g., local minima) may exist and provide suboptimal or even wrong estimates. 
Here we demonstrate for a 9-dimensional Lorenz-96 model how to characterize the basin size of the global 
minimum when applying some particular optimization based estimation algorithm. We compare three different 
strategies for generating suitable initial guesses and we investigate the dependence of the solution on the 
given trajectory segment (underlying the measured time series). To address the question of how many 
state variables have to be measured for optimal performance, different types of multivariate time series are 
considered consisting of 1, 2, or 3 variables. Based on these time series the local observability of state variables 
and parameters of the Lorenz-96 model is investigated and confirmed using delay coordinates. This result is in 
good agreement with the observation that correct state and parameter estimation results are obtained if the 
optimization algorithm is initialized with initial guesses close to the true solution. In contrast, initialization 
with other exact solutions of the model equations (different from the true solution used to generate the time 
series) typically fails, i.e. the optimization procedure ends up in local minima different from the true solution. 
Initialization using random values in a box around the attractor exhibits success rates depending on the 
number of observables and the available time series (trajectory segment). 
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For many physical processes dynamical models 
are available but often not all their state variables 
and (fixed) parameters are known or easily acces¬ 
sible. In meteorology, for example, sophisticated 
large scale models exist, which have to be con¬ 
tinuously adapted to the true temporal changes 
of temperatures, wind speed, humidity, and other 
relevant physical quantities. In quantitative biol¬ 
ogy mathematical models of single neural or car¬ 
diac cells or networks may contain many state 
variables and parameters whose values are not 
easy to measure (without destroying the system). 
In such cases, data based estimation methods can 
be used to determine these unknown states and a 
parameters by adapting a suitable model to re¬ 
produce and predict the measured time series. 
This approach can be successful only if two con¬ 
ditions are fulfilled: (1) the available data have to 
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provide sufficient information, i.e. the unknown 
state variables and parameters have to be ob¬ 
servable and (ii) the estimation algorithm has to 
be properly initialized with initial guesses suffi¬ 
ciently close to the true solution. Here, we con¬ 
sider both problems for the Lorenz-96 model and 
compare different initialization methods in terms 
of their effective basin sizes. 


I. INTRODUCTION 

Estimation methods for state variables or 
(fixed) parameters can be implemented employing 
synchronization^^' or optimization methods®^®, for 
example. In the literature one can find many examples 
with successful applications of state and parameter 
estimation methods even for chaotic systemsIn 
practice, however, attempts to fit a model (for example, a 
set of nonlinear ordinary differential equations (ODEs)) 
to given data may fail. There are many possible reasons 
for such a failure, including inappropriate models, poor 
quality of the measured time series (too noisy, too 
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short), or external perturbations not covered by the 
model. But even with relatively clean data and the 
right model architecture, estimation may turn out to 
be difficult, because the available data do not contain 
sufficient information about the underlying process. 
Therefore, in this article we address how the success of 
a given estimation algorithm for a given model depends 
of the following aspects: 

(a) the number of available observables (in a multivari¬ 
ate time series) 

(b) the available time series (corresponding to some 
particular trajectory segment) 

(c) and the way the estimation algorithm is initialized 
(using guesses for the unknown quantities). 

The focus in the presented analysis is on models given by 
ODEs, 

^=F(x(f),p,t), (1) 

and a measurement function, 

y(t) = h(x(<)) e , (2) 

representing the model output with a state vector x(t) = 
... ,XD{t)Y'^ S R^, and model parameters p = 
{pi,... G R'^^. Here and in the following the 

superscript “tr” denotes the transpose. We assume that 
a multivariate L-dimensional (experimental) time series 
{r]{n)} is given consisting of iV-|-l samples q{n)=r]{tn) G 
R^, analogous to the model output, and measured at 
times T = {tn = n ■ At \ n = 0,1,..., N}. The observa¬ 
tion times tn are equally spaced (with a fixed time step 
At) and start at to = 0. Any solution of Eq. (1) at dis¬ 
crete times tn = n-At with a fixed time step At is denoted 
by {x(n)} and consists of A-|- 1 samples x(n)=x(t„) at 
times tn G T. If, from the context. At and the range 
of n are clear, this information will be dropped in the 
following. The same convention holds if a solution is de¬ 
noted by another symbol, for example z instead of x: 
the solution {z(n)} consists of A-l-1 samples z(n)=z(t„) 
measured at times t„ G T- 

As an example we use synthetic data from a 9- 
dimensional Lorenz-96 system^® (Sec. II) and an op¬ 
timization based estimation algorithm® (Sec. IV). We 
check the observability of the state variables of the 
Lorenz-96 model which are not “measured” (i.e., not con¬ 
tained in the multivariate time series) using a local anal¬ 
ysis (Sec. Ill) employing the Jacobian matrix of the delay 
coordinates map'^®d7^ This analysis indicates that even 
with a single observable (scalar time series) all state vari¬ 
ables and the parameter of the Lorenz-96 system are in 
principle (locally) observable. 

To investigate global convergence features (using initial 
guesses that are not close to the true solution) we probe 
the basin structure of the observability problem by con¬ 
sidering 18 different trajectories of the Lorenz-96 system 


(on the same chaotic attractor, but generated with differ¬ 
ent initial conditions). From each trajectory 15 different 
(multi-variate) time series are derived consisting of one, 
two, or three observables. Then a particular method for 
generating initial guesses to initialize the optimization 
algorithm is chosen, and the estimation algorithm is ap¬ 
plied to each of these 15 time series 500 times (with dif¬ 
ferent random initial guesses) to obtain statistics of how 
often the estimation problem is solved successfully. In 
other words, we compute the probability that a gener¬ 
ated initial guess is located in the basin of the true so¬ 
lution of the given optimization algorithm. This method 
of estimating the “basin size” was adopted from Menck 
et al.^®. 

In Sec. II we introduce the Lorenz-96 model which will 
serve as an example for the following studies. First, lo¬ 
cal observability of the state variables and the parameter 
of the Lorenz-96 model is investigated and confirmed in 
Sec. III. Then, in Sec. IV we present the estimation algo¬ 
rithm used and in Sec. V our approach for characterizing 
the size of the basin of the true solution is introduced. 
The true solution is a stable fixed point of the optimiza¬ 
tion algorithm with a basin of attraction and the desired 
estimation of the true solution is only possible if the op¬ 
timization algorithm is initialized with guesses from this 
basin. To check the stability of this fixed point the opti¬ 
mization procedure was initialized by initial guesses con¬ 
sisting of randomly perturbed true values. For all these 
initial guesses the optimization results converged to the 
true solution. However, since in general the location of 
the true solution is not known the size and the struc¬ 
ture of its basin are most important for any initialization 
strategy. Three possible initialization methods (Sec. VB) 
are invested in detail and compared in terms of their ef¬ 
ficacy for finding the true solution. All results are sum¬ 
marized in the conclusion drawn in Sec. VI. 


II. EXAMPLE: THE LORENZ-96 MODEL 

As an example for demonstrating the proposed analysis 
we consider in the following a D = 9 dimensional Lorenz- 
96 model 

H T ■ 

= x,_i(f) • (x,+i(t) - x,_2(t)) -X,+P (3) 

dt 

with p = 8.17 and a cyclic index i (xn+iit) = xi{t), 
xo{t) = xnit), and X-i{t) = xn-iit)). For the parame¬ 
ter value p = 8.17 the model generates a chaotic attrac¬ 
tor. 

The Lorenz-96 model is chosen here as an example 
because previous investigations showed that it is very 
difficult to estimate its state variables and the param¬ 
eter p using only a few observables^®. Recently, however, 
Rey et. al.'^’ demonstrated successful state and parameter 
estimation based on univariate time series consisting of 
a single Lorenz-96 state variable and a synchronization 
scheme employing delay coordinates. Law et al.®® applied 




3 


the extended Kalman filter and the 3D-VAR data assimi¬ 
lation technique to the chaotic Lorenz-96 model and also 
encountered difficulties in the estimation of model state 
variables if only few model state variables are observed. 

Technically, the Lorenz-96 model (3) is used here in a 
twin experiment for both, (i) generating the “measured” 
time series and (ii) as a model to be adapted to a (multi¬ 
variate) time series using the optimization based estima¬ 
tion method described in Sec. IV. 

To address the question how many observables 
have to be known for successful state and parameter 
estimation we consider multivariate time series {f]{n)} 
with one, two, or three state variables. More precisely, 
for the 9 dimensional Lorenz-96 model we consider all 
possible combinations of one to three state variables 
as being “measured”. For example, let us assume we 
can measure the state variables (xi, X2, X5). Due to the 
symmetry in Eq. (3), sampling (xi, X 2 , X 5 ) is equal to 
measuring {x 3 ,X 4 ,X'r) or {xt, Xs, X 2 )- Hence checking 
the observability of all state variables and the parameter 
p with the given multivariate time series {xi,X2,X5) is 
equivalent to checking the observability with the time 
series (xr,xs,X2)- Removing all mathematically equiv¬ 
alent combinations results in the following 15 distinct 
combinations of state variables: xi, (xi,X2), (xijXs), 

(xi,X4), (xi,X5), (xi,X2,X3), (xi,X2,X4), (X4,X2,X3), 

(X1,X2,X6), {XI,X2,X7), {Xi,X2,Xs), (xi, Xa , Xs) , 

(a:i,a: 3 ,a; 6 ), (xi, 0 : 3 ,xy), and (xi,X 4 ,X 7 ). 


III. LOCAL OBSERVABILITY OF MODEL STATE 
VARIABLES AND FIXED PARAMETERS 

We consider models given by a set of D coupled ODEs, 
Eq. (1), with a measurement function, Eq. (2), represent¬ 
ing the relation between model states x(n) and the model 
output y(n) corresponding to the observations {ri{n)}. 
The state vector(s) x(t) and the model parameters p are 
unknown and have to be estimated from a (multivariate) 
time series. The technique used in this article to adapt 
a model given by ODEs (1) to a (multivariate) time se¬ 
ries given by {'ri{n)} with a measurement function (2) 
will be described in Sec. IV. Similar to other methods 
for state and parameter estimation, this algorithm will 
provide estimates for the model state variables and the 
(fixed) model parameters (except if, for example, numeri¬ 
cal problems arise). The fact, however, that an algorithm 
produces some output does not mean that this output is 
correct or useful. Therefore, a method is needed which in¬ 
dicates whether it is (in principle) possible to estimate p 
and x(f) correctly from {r]{n)}. This question addresses 
the general problem of observability, which is well known 
from control theory^^“^®. In the following section we shall 
employ the time delay coordinates map of the observed 
time series to investigate local observability following an 
approach presented in Refs.^®’'^^. 


A. The delay coordinates map 

Let the dynamical system (1) generate a flow 

: IR^ 0 ^ R^ (4) 

(x(f),p) x(t-l-r) 

mapping a state x(t) at time t G R to a (future) state 
x(f -|- t). Furthermore, delay coordinates are given via 
the L dimensional measurement function Eq. (2) 

y{t + T) = h(x(t-kr)) = h{(j)^{x{t),p)) (5) 

from a trajectory starting at x(t) with delay time t. If 
the delay time is t = 0, then we obtain (/)°(x(t), p) = x(t) 
and recover y(f) = h(x(<)). Taking into account K time 
steps we can define a Dm = K • L -dimensional delay 
coordinates map 

s = S(x(t),p) 

= (y*’'(t), y*''(i + r),. .., y‘^(t + {K- l)r)) . (6) 

Here the delay coordinates map S is considered as a func¬ 
tion of: (i) the state x(<) and the parameters p of the un¬ 
derlying system, and (ii) of the delay time t (not listed as 
an argument of S here, because r is fixed and not part of 
the estimation problem). All y‘"'(t-|-Lr), i = 0,..., AT— 1 
are row vectors. Hence, the right hand side of Eq. (6) is 
a (row) vector containing K ■ L elements. 

If the delay coordinates map Eq. (6), S : R^ 0 R^^ —>• 
r£>m, jg locally invertible, then the full state x(t) and 
the parameter vector p can be uniquely determined from 
the signal h(x(t)), which, in a real world experiment, 
corresponds to the measured time series {r;(t„)} Eq. (2). 
Mathematically, the delay coordinates map Eq. (6) has 
to be an immersion^®, i.e. the Jacobian matrix Dx.pS = 
Dx,pS(x(t), p) of the delay coordinates map S has to have 
maximal (full) rank. 

The accuracy and robustness of estimated state vari¬ 
ables or fixed parameters can be quantified by the uncer¬ 
tainty 

= ^[Dx,pS‘--. Dx.pS]-/ (7) 

of state variables (j = and parameters {j = 

D -\- 1,... ,D -\- Np) which was introduced in Ref.^^’^^. 
Perturbations of the measured time series are amplified 
by Vj, i.e. the larger Vj the less precise is the estimation of 
the corresponding state variable or fixed parameter. Note 
that the uncertainty Vj depends (via Dx,pS(x(t),p)) on 
the location in state and parameter space. 

B. Local observability of tbe Lorenz-96 model 

To assess the local observability of the states x(t) and 
the (single) model parameter p of the D = 9 dimensional 
Lorenz-96 model (3) their uncertainty is checked at 10^ 
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arbitrary reference points on the attractor. To obtain 
the reference points the Lorenz-96 model was integrated 
10^ steps with a step size of 0.01 using a Runge-Kutta-45 
integration scheme. Then every 1000th point was picked 
as a reference point x(t) for the observability analysis 
described in the following. 

As mentioned in Sec. II for the 9-dimensional Lorenz- 
96 model there are 15 different combinations of one to 
three state variables constituting a multivariate time se¬ 
ries. We select the following two cases as representative 
examples: 

(a) measurement function /i(x(t)) = xi{t) (i.e., L = 1 ) 
with K = 12 and a resulting delay reconstruction 
dimension of Dm = 12 and 

(b) measurement function h(x(t)) = 

{xi{t),X 3 {t),XQ{t)) (i.e., L = 3) with K = 4 
and hence a delay reconstruction dimension of 
Dm = 12 (see Eq. ( 6 )). 

The reconstruction dimension Dm = L - K is the same 
in both cases. Histograms of the uncertainties vj, for 
j = 1,..., 10, Eq. (7), are computed for the 10^ reference 
points on the attractor. Figure 1 shows the histograms 
for vi, and viq which are plotted vertically using color 
coding (relative frequencies of the corresponding uncer¬ 
tainties are given in percent, see color bar). All distribu¬ 
tions shown here are unimodal. The left column shows 
the results for h{x{t)) = xi{t) and the right column for 
h(x(t)) = {xi{t),X 3 {t),XQ{t)). In both cases the uncer¬ 
tainty i^i corresponds to the “measured” state variable 
xi, corresponds to the “hidden” state variable x^ (not 
measured directly), and viq corresponds to the model 
parameter p. Histograms for the other Vi are not shown, 
because they look very similar to the histograms for vi (if 
the corresponding state variable is measured) and (if 
the corresponding model state variable is unmeasured). 
The reason for these similarities is the symmetry in the 
Lorenz-96 model equations (3). 

For both measurement functions one can see that 
the uncertainty values of the maxima of the histograms 
exhibit a U-shaped dependence on the delay time t. 
The smallest uncertainties occur between r = 0.11 and 
T = 0.21 for the unmeasured state variables (Figs. lc,d). 
In both cases the distribution of the uncertainty vi of 
the measured state variable (and similar for 1/3 and vq 
if i? = 3, not shown) possess a sharp peak around 
logj^o Vi = 0. In this case the state variable xi is an 
observed quantity, and therefore, the delay reconstruc¬ 
tion does not provide much further information about 
its values. For relatively large delay times ( r > 1 in 
(a) and r > 4 in (b)) the delay reconstruction becomes 
rather “poor”. The measurement noise is amplified re¬ 
sulting in a tail of the histogram with uncertainty values 
larger than one (the yellow areas above the horizontal 
line at log(z^i) = 0). This result is not surprising, be¬ 
cause in nonlinear time series analysis it is well known 
that the delay time has to be chosen carefully and must 


not be too large for chaotic attractors, since otherwise 
the reconstructed attractor will be heavily distorted. In 
contrast, the values of the centers of the distributions of 
the uncertainty 1^10 of the parameter p decrease with in¬ 
creasing T until the maximum of the distribution is below 
one (see Figs. le,f) (without exhibiting a clear minimum). 

As mentioned before, these two examples are represen¬ 
tative for all multivariate time series from the Lorenz-96 
model consisting of combinations of one to three state 
variables. The other 13 combinations show similar his¬ 
tograms. If a certain state variable in one of the other 
combinations is measured, then the corresponding his¬ 
togram of the corresponding uncertainty looks similar to 
the histograms of vi in Fig. la. The same holds for state 
variables that are not contained in the multivariate time 
series (and where the corresponding v histograms look 
similar to the histograms of 1/5 in Fig. 1) and the model 
parameter p (the corresponding vio histograms look simi¬ 
lar to the histograms of viq in Fig. 1). With an increasing 
number of measured state variables (from one to three) 
the maxima of the distributions at the minimum of the 
histograms of v for unmeasured state variables move only 
slightly in the direction of smaller v (see, for example. 
Figs. lc,d). This trend also holds for all combinations 
of four measured state variables (not shown here). The 
fact that the histograms for one to three observed state 
variables look almost the same means that the loeal ob¬ 
servability of the model parameter and the unmeasured 
state variables is maintained, even if only a single model 
state variable is measured instead of three (or more). In 
the following we shall investigate whether this local result 
holds for states that are not close to the true solution. 
Specifically, we want to know if we can uniquely recover 
the true solution using a univariate time series (from a 
single observable), even if the initial guesses are far from 
the true solution. Since the true solution is a stable fixed 
point of any optimization based estimation algorithm, 
we are thus interested in the ’size’ and the structure of 
the basin of attraction of this fixed point. To address 
this question we shall introduce in the next section a 
particular estimation algorithm which is used herev as a 
prototypical example for investigating the solution basin. 


IV. STATE AND PARAMETER ESTIMATION 
ALGORITHM 

The method used in this study to adapt a model to 
a time series is based on minimizing a cost function 
and in the context of this paper it represents only one 
out of many different state and parameter estimation 
methods'’’^ where the same type of basin size analysis 
could be applied. The method was introduced in Ref.® 
and will be summarized in the following. 

The goal of the estimation process is to find a set of 
values for the model state variables x(t) at each time 
step of its discretization and the model parameters p 
such that the model equations, given by a set of ODEs 


5 



Figure 1. (Color online) Probability distributions (color coded frequency in %, vertically plotted) of uncertainties vi, and 
v\o vs. delay time r for the Lorenz-96 model. The uncertainties vi and correspond to state variables xi and x^, respectively, 
and zzio corresponds to the parameter p. In the left column ((a),(c),(e)) a scalar (L = 1 dimensional) measurement function 
h{Ti{t)) = xi{t) is used for generating delay coordinates with K = 12. The right column ((b),(d),(f)) shows results for the 
L = 3 dimensional measurement function h(x(t)) = {xi{t),X 3 {t),xe{t)) and a delay reconstruction (Eq. ( 6 )) with K = 4. Hence 
Dm = 12 dimensional delay coordinates (Eq. ( 6 )) are constructed in both cases. The histograms for ui are very similar to 
histograms obtained for other measured state variables (* 3 , xe in the right column), not shown here. Similarly, the histograms 
of 1/5 are representative for histograms of the remaining unmeasured state variables (not shown here). 


(see Eq. (1)), provide via the measurement function (2) 
a model times series {y(n)} consisting of iV -b 1 sam¬ 
ples y(n)=y(t„) S with G TVn that matches the 
experimental time series { 77 ( 71 )}. In other words, the av¬ 
erage difference between 77 ( 71 ) and y(7i) should be small. 

Furthermore, the model equations should be fulfilled as 
well as possible. This means that modeling errors u(t) 
are allowed, but should be small. Therefore, model (1) 
is extended to include modeling errors u(t), 

= F(x(t),p,t)-bu(t), (8) 


ence 77 ( 71 ) — y(n) for all n G T- Technically, this opti¬ 
mization problem can be implemented in different ways'^ 
and in the following we use unconstrained optimization® 
employing automatic differentiation'*. The cost function 
used in this study can be derived from a general proba¬ 
bilistic description of the estimation problem assuming 
Gaussian distributions (also called weakly constrained 
4D-VAR in geosciences) and consists of four terms 

C({x(77)},p)=C'i + C2+C'3+C4 (10) 

with 


so that when u{t) is small the model trajectory x(t) 
closely matches the model equations. To incorporate 
model error into the optimization, we discretize u(<) and 
x(t) at times G T so that the state variables {x(n)}, 
x(n)=x(t„), at each time n must be estimated in addi¬ 
tion to p. For simplicity, we choose x(t) and y(t) to be 
sampled at the same times that the data are observed. 
Similarly, u(t) is discretized to {u(7i)} with u(rz)=u(t„) 
and tn G T. The discretization of ( 8 ) is given by 


U(7l) 



F(x(7i),p,t„), 


( 9 ) 


N 


Cl = 


Co = 


N+1 

1 — a 

N+1 




n—O 

N 


J2n{nrBu{n) 


( 12 ) 


n—0 

N-2 


= iV^l ■ (^apr(n) - x(n))‘"E (Xapr(77) - x(7l)) 

n—3 

(13) 

Ci = P- q(w, wi, Wu)*'' • q(w, Wi, w^) . (14) 


where the symbol ^ | ^ stands for the finite difference 

approximation of at time tn- 

The goal of the adaption process is to minimize (on 
average) the norm of u(7i) and the norm of the differ- 


The term Ci penalizes the difference between rj{n) and 
y(7i) whereas C 2 penalizes large magnitudes of u(7i). 
In the term C 3 a Hermite interpolation is performed 
to determine Xapr(7i) from neighboring points and the 
time derivatives which are, according to (8), given by 
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F(x(i), p, i) + u(i) and provide the approximate solutions 

11 8 
Xapr(n) =— [x(n - 2) + x(n + 2)] + — [x(n - 1) 

+x(n + 1)] + ^ [F(x(n - 2), p, t„_ 2 ) 

+u(t„_ 2 ) - F(x(n + 2), p, tn+ 2 ) - u(t„+ 2 )] 

4 At 

+ [F(x(n- l),p,t„_i) +u(t„_i) 

-F(x(n + 1), p, t„+i) - u(t„+i)] . (15) 

Smoothness of {x(n)} is enforced by small differences 
Xapr(n) — x(n). The term C 3 suppresses non-smooth (os¬ 
cillating) solutions which may occur without this term in 
the cost function. In this paper the weight matrices A, 
B and E are diagonal matrices. The diagonal elements 
can be used for an individual weighting. 

The solution ({x(n)},p) obtained through the opti¬ 
mization of the cost function ( 10 ) is taken to be the max¬ 
imum likelihood estimate. 

Let 

w = ({x(n)},p) (16) 

be a vector containing all quantities to be estimated. To 
force w to stay between the lower and upper bounds 
wi and Wu, respectively, the vector valued function 
q(w, wpWu) = (gi,... is defined as 

{ Wu,i - Wi for Wi > Wu,i 
0 for wii <Wi< Wu,i 

Wiy - Wi for Wi < Wi^i. 

(17) 

qi is zero if the value of Wi lies within its bounds. To 
enforce this, the positive parameter /3 is set to a large 
number, e.g. 10 ®. 

The homotopy parameter a can be used to control 
whether the solution should be close to data (a « 1 ) or 
has a smaller error in fulfilling the model equations. In 
Ref.®*’ a technique is described to find an optimal a. Fur¬ 
thermore, one might use continuation (see Ref.’') where 
a is stepwise decreased. Starting with a « 1 results in a 
solution close to the data. Then, a is slightly decreased 
and the previously obtained solution is used as an initial 
guess to optimize the cost function again. This procedure 
is repeated until the value a = 0.5 is reached. 

Note that the cost function can be written in the form 

,7 

C(w) = ^i7j(w)2 = ||H(w )||2 (18) 

i=i 

where H(w) is a high dimensional vector valued func¬ 
tion of the high dimensional vector w. To optimize (18) 
we use an implementation of the Levenberg-Marquardt 
algorithm®’’®^ called sparseLM ’'®. Although C'(w) will be 
optimized, sparseLM requires H(w) and the sparse Jaco¬ 
bian of H(w) as input which is computed using the au¬ 
tomatic differentiation tool ADOL-C®"’’®®, and described 
in more detail in**. 


V. DETERMINING THE BASIN SIZE OF THE TRUE 
SOLUTION 


In Sec. IV we described a state and parameter estima¬ 
tion algorithm that has to be initialized with guesses for 
all model state variables x(t„) at each time step and 
all fixed model parameters p. This set of values forms an 
initial guess, which must be supplied to the optimization 
algorithm. In this section three different methods for gen¬ 
erating the initial guesses are presented and simulations 
consisting of twin experiments are performed to deter¬ 
mine which of these methods gives the best estimates for 
the model state variables and fixed parameters. These es¬ 
timates are then compared with the true solution which 
is known exactly in this case since this is a twin exper¬ 
iment. Due to the fact that the methods for generating 
the initial guesses, in a certain way, depend on random 
numbers and the outcome of an estimation process is ei¬ 
ther successful (estimated states and parameters are close 
to the ones used to generate the data time series) or not 
successful (estimated states and parameters are not close 
to the ones used to generate the data time series) the sim¬ 
ulations can be considered as Bernoulli experiments and 
the basin size of initial guesses leading to the true solu¬ 
tion can be determined, as suggested by Menck et ah’® 
in another context. 


A. The simulation 


First we generate 18 “true” trajectories {*z(n)} with 
7 = 1,..., 18 by integrating the 9-dimensional Lorenz-96 
model (3) with 18 different initial conditions z(0) on the 
attractor with At = 0.01 and N = 1500 using the model 
parameter p = 8.17. Then Ajguess = 500 initial guesses 
({)jX(n)}, ^p) (h = 1,..., Aiguess) of the model state vari¬ 
ables and the (fixed) parameter p are generated which are 
used for initializing the estimation procedure (the estima¬ 
tion algorithm was described in Sec. IV). Three different 
methods for generating the initial guesses will be pre¬ 
sented in Sec. VB. The following steps in the simulation 
do not depend on the specific choice of the method for 
creating the initial guesses. 

From each of the 18 true trajectories {*z(n)} with 
i = 1,..., 18, according to Sec. II, 15 multivariate time 
series were extracted corresponding to the 15 different 
combinations of state variables assumed to measured. 
This gives 270 different multivariate time series with one. 
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two, or three state variables: 

{*zi(n)}, {\zi(n),Z2{n))}, {\zi{n), Z3{n))}, 
{\zi{n),zi{n))}, {\zi{n), z^{n))}, 
V{zi{n),Z2{n),Z3{n))}, {\zi{n), Z2{n), Zi^n))}, 
V{zi{n),Z 2 {n),Z 5 {n))}, {^{zi{n), Z 2 {n), ze{n))}, 
{\zi{n),Z2{n),zr{n))}, {\zi{n), Z2{n), Zs{n))}, 
V{zi{n),Z3{n),Z5{n))}, {*(^i(n), ^ 3 ( 71 ), Z6(«-))}, 
Vizi{n),Z 3 {n),Z 7 {n))}, {*(zi(n), 24 ( 71 ), 27 ( 71 ))} (19) 

with i = 1,..., 18, At = 0.01, n = 0,1,..., A and N = 
1500. 

To make the simulation more realistic, white noise 
(normally distributed random numbers) is added to these 
270 clean, multivariate times series. This results in 270 
noisy multivariate time series {* 77 '^( 7 i)} with At = 0.01, 
77 = 0,1,..., A and N = 1500. 

Each noisy time series is computed by 

* 77 ^( 77 ) = (*z(77)) + ats^ein ), ( 20 ) 

where (Xts = 0.2 and *^'^( 77 ) = (®^}(77),..., *^£( 77 )) G 
are independent, normally distributed random variables 
with zero mean and a variance of one, ~ A/’(0,1). 

The index i = 1,..., 18 describes the true trajectory 
1 * 2 ( 77 )} from which the data time series was extracted. 
Index c indicates which state variables were measured. 
For example, c = (1 — 2 — 6) means that the state vari¬ 
ables 2 i, 22 and Zq are the measured state variables. The 
label h = 1,..., Ajguess describes with which initial guess 
the estimation algorithm was initialized. The measure¬ 
ment function h'*(x(t)) is always chosen according to the 
measured state variables defined by c. If, for example, the 
state variables 21,22 and 27 are measured (and therefore 
c = 1 — 2 — 7), then the measurement function is given 
by h*=(x(t)) = {xi{t),X 2 {t),X 7 {t)). 

To each of the 270 multivariate time series {*i 7 °( 77 )} the 
Lorenz-96 model is adapted Aiguess = 500 times, whereas 
each of the Ajguess estimation processes is initialized with 
one of the previously generated Ajguess different (random) 
initial guesses using the estimation algorithms described 
in Sec. IV. This means that Ajguess • 270 = 500 • 270 = 
135000 estimation problems are solved. 

For each solution of the estimation processes the dif¬ 
ference between the true and the estimated solution is 
given by the estimation error 


< 10“^, else the estimation is considered as not suc¬ 
cessful. The value for the estimated fixed model param¬ 
eter is considered as successful, if G [8.16,8.18] 
(remember, the true trajectories {*2(77)} were generated 
with p = 8.17 in Eq. (3)). 

We are interested in a quantity (in percentage) which 
tells us how many estimations with a specific true trajec¬ 
tory, 7, and a specific combination of observed state vari¬ 
ables, c, of the model state variables are successful. In 
other words: For how many of the Ajguess = 500 estima¬ 
tions using Ajguess different initial guesses with a specific 
true trajectory, 7, and a specific combination of observed 
state variables, c, is the estimation of the model state 
variables successful, i.e. < 10“^? This quantity, 

which of course depends on the true trajectory and the 
combination of measured state variables, is defined here 
as the success rate of the estimation of the model state 
variables (in percentage) 




with Ch 


100 % 

A 


Ni 


iguess 


guess 

E 

h^l 


1 


f 0 , if > 10-2 

}1, if < 10-2 • 


( 22 ) 


One can also define an error which depends on the esti¬ 
mated solution and the data, only, as 


1 J 

= (]vTiFl II 2 • (23) 

Assume one estimates the best possible solution. That is, 
if the estimated solution is equal to the trajectory (with¬ 
out noise) used to generate the data, = *2(77). In 

this case Eq. (23) is (using eq. (20)) 


I T^C 

/i'^obs,opt 


1 

''{N+l)L 


N 


E ir^'^(«) 


n—O 


^ts 2/AC 


(A+1)L 






(*2^(77)) 


2 

2 


(24) 


where 




C 


VVCffWl* ■ 


n—0 1 — 1 


(25) 


The indices of \E‘^, *2'*(77) and have the same 

meaning as for {}x°(77)}. The smaller the error mea¬ 
sure JjA"* the closer the estimated solution for the model 
state variables is to the true solution and hence the more 
accurately the estimation problem was solved. The es¬ 
timation of state variables is considered as successful if 


Because are independent, standard normal ran¬ 

dom variables, is chi-squared distributed, ^ 
X(jv+i)L) with (A -I- 1)L degrees of freedom*®. The ex¬ 
pectation value is then given by A [*(5'*] = (A -|- 1)L, 
leading to an expectation value for }A£j,g of 

E = 4 ■ (26) 

The variance of the chi-square distribution is Var = 







2(iV + 1)L. The variance of is then 


Var [Lifo^bs.opt] = Var 


•^ts ir\c 


{N+l)L 






[(7V + 1)L]2 

24 

(iV + l)T 


Var [*Q"] 


giving the standard deviation 


Std [Li?o“b..opt] = V Var 


I rpc 

/i'^obs,opt 


= cr,. 


{N+1)L ■ 


(27) 


(28) 


As described in Sec. V A we use crts = 0.2 and N = 
1500. For one observed state variable, L = 1, and a 
perfect solution of the estimation problem, we get the 
lower boundary for Eq. (23), of 




1± 



0.04 ± 0.00146 . 


(29) 


Note, that in a real world experiment and hence 
typically can not be computed due to the unknown 
true trajectory {*z(n)}. Another possibility to compute 
the accuracy of the estimated model state variables and 
the fixed model parameters is to compare predictions of 
the model via the measurement function h(x(t)), Eq. (2), 
with available (noisy) data after the estimation window. 
To compute the prediction the model Eq. (1) must be 
integrated starting at the end of the estimation window at 
tjsf using the estimated value as model parameter and 
)jX°(7V) as initial guesses. Next, the prediction {),x'=(n)}, 
n > N, can be compared with observed data {^r]‘^{n)}, 
n > N, hy computing the prediction error 

W+Afp,ed 

(32) 

for A^pred time steps using the same step size At as for 
computing the true trajectories. Due to noise in the data 
the prediction error cannot vanish and we consider a 
prediction as successful, if ),PE'^ < 0.5. Analogous to 
Eq. (22) we define the success rate of the prediction (in 
percentage) 


Note that only the standard deviation depends on the 
number of measurements (and is largest for L = 1), 
but not the expectation value. This means that with a 
smooth estimate for the model state variables one can not 
go below this boundary. If one goes below this threshold, 
the measurement noise is modelled and one has not esti¬ 
mated a smooth solution for the model variables. In this 
case one should choose a smaller a in the cost function 
Eq. (10). Note that the modelling of the measurement 
noise is still possible if one does not fall below this bound¬ 
ary. Because of the perfect model scenario in our twin 
experiments we can expect a value for which is only 

slightly larger (due to small numerical errors) than the 
lower boundary of 0.04± 1.46 • 10“^. To cover these cases 
we introduce an empirical margin of 0.005 which is added 
to the lower bound of 0.04 and we consider an estimation 
as successful if ^Ef^^ < 0.045. Applying this bound to 
the error given by Eq. (23) we can define a success rate 
(in percentage) 


with Cfi 


100 % 

’iguess 


N- 

' iguess 

£ e/,. 


h^l 


fO, if > 0.045 

\1, if < 0.045 ■ 


(30) 


In a similar way, the success rate of the estimation of 
the model parameter (in percentage) is defined as 


(T) 


with ph 


100 % 

i fnipss 


Ni, 


h=l 


1 0, if Ip^ i [8.16,8.18] 
\1, ii^p^ G [8.16,8.18] 


(31) 


(*PE=) = 


100 % 


Nu 


h^l 




(33) 


with 


Pe/i = 


0, if *^PE^ > 0.5 
1, if IPE^ < 0.5 


describing for how many of the different initial guesses 
with the same true trajectory and the same combination 
of measured state variables the prediction was success¬ 
ful. In contrast to \E'^ the prediction error jjPE'^ can be 
computed using measured data only. 


B. Different methods for generating initial guesses 

For the optimization process initial guesses for the 
model state variables and the fixed model parameter p 
have to be chosen. In our simulation we considered three 
methods for preparing initial guesses according to rules 
specified below. For each case Mguess = 500 different 
guesses ({),x(n)}, ^p) with h = 1,..., Aig^ess are gen¬ 
erated. In all three cases the model parameter hP is 
picked equally distributed from the interval [4,20]. In 
those cases where the initial guess {)jx(n)} for the model 
state variables does not depend on the true trajectory 
{*z(n)} of the estimation problem, the index i will be 
neglected (i.e. )jX(n) = In the following, three 

different methods of choosing the initial guesses will be 
used and evaluated: 

1. Uniformly distributed samples in a box: For 

each initial guess each model state variable j^Xdin), 
d = 1,..., D at each time step is an equally dis¬ 
tributed random number in the interval [—9,14]. 
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This interval has been chosen because it is the 
range of typical oscillations of all state variables 
of the Lorenz-96 model. Together with the model 
parameter the initial guesses consist of D ■ N +1 = 
£*iguess = 13501 numerical values. In other words, 
the initial guesses are uniformly distributed points 
in a box in a Higuess dimensional space 

2. Exact solutions of the model: Each initial guess 
({^x(n)}, i^p) is an exact solution of the Lorenz-96 
model Eq. (3). The initial values ^x(O) of these 
trajectories are arbitrary points on the attractor 
generated with p = 8.17 (not coinciding with the 
initial conditions of the true trajectories). 

3. Samples close to the true solution: These 
initial guesses depend, in contrast to methods 1 
and 2, on the “true trajectories” {*z(n)} with 

1 = 1,...,18 (see Sec. VA). The estimation pro¬ 
cesses will be initialized with a “noisy” version 
of {*z(n)}. More precisely, for each time step 
uniformly distributed random numbers from the 
interval [—15,15] are added to the values of the 
true state {*z(n)} to generate the initial guesses 
{5,x(n)}, Compared to initial guess strategy 1 and 

2 this strategy does depend on the true trajectories. 
In a real world application, where the true trajec¬ 
tories are not known, this strategy can not be used 
in contrast to methods 1 and 2. 


C. Interpretation of the simulation as Bernoulli 
experiment and error estimation 

As described in Sec. V A for each of the 18 true 
trajectories and each of the 15 combinations of mea¬ 
sured state variables, the Lorenz-96 model was adapted 
IViguess = 500 times to the corresponding (multivariate) 
time series using a specific method for choosing the initial 
guesses. If < 10“^ (Eq. (21)) then the estimation of 
the model state variables is considered as successful. This 
simulation can be interpreted as a Bernoulli experiment, 
because each of the independent A^iguess estimations of 
the model state variables and the fixed parameter is a 
Bernoulli trial with the outcome successful or not suc¬ 
cessful. The standard error of the Bernoulli process is 
given by 

, yy(100% - y) 

/W- ’ 

■y ' iguess 

whereas y G [0%, 100%] is the expectation value of the 
percentage of successful cases (index i describes the used 
true trajectory and index c describes the combination 
of measured state variables). Unfortunately, we do not 
know However, we can determine the maximum of 
the standard error ^e^ which occurs for y = 50%. With 
IViguess = 500 trials the maximal standard error equals 
*emax ~ 2.24% and hence is sufficiently small. 


D. Results 

1. Estimation Error 

The simulation described in Sec. V A was performed 
with all three methods for choosing initial guesses for 
the model state variables and the fixed model parameters 
({y(n)},^p) as described in Sec. VB. For each method 
of choosing the initial guesses the percentage of success¬ 
ful estimations, Eq. (22), was computed, where 

an estimation of the model state variables is considered 
as successful if < 10“^, Eq. (21) (see Sec. VA). 
The estimation of the model parameter is considered as 
successful if \p’^ G [8.16,8.18]. The success rate for the 
fixed model parameter, (y), is defined in Eq. (31). The 
statistic (percentage of successful estimations) was cre¬ 
ated for each of the 18 true trajectories {*z(n)} (indexed 
by i), each of the 15 combinations of observed state vari¬ 
ables, c, and all fViguess = 500 initial guesses (indexed by 
h). 

Tables Ia,b show the results for method 1 (uniformly 
distributed samples in a box). The tables show 
(Table la) and {'‘ff') (Table Ib) for each combination of 
a true trajectory {®z(n)} and a particular choice of mea¬ 
sured state variables. If three variables are measured, the 
rate of successful estimations of the model variables and 
the fixed parameter is (on average) higher for all com¬ 
binations of measured state variables compared to the 
success rate for multivariate time series with only two 
variables. Nevertheless, certain combinations with two 
observed variables {xi,X 2 )^ {xi^x^) or {xi,Xa) also give 
success rates that are only slightly lower than combina¬ 
tions of three observed state variables. They just appear 
less often compared to time series with three observed 
state variables. When (a;i,a; 5 ) are observed the estima¬ 
tion of the model state variables and the fixed parameter 
does not seem to work very well. None of the 18 trajecto¬ 
ries considered here exhibit high success rate. If only Xi 
is observed the estimation of variables and the parame¬ 
ter fails for all 18 trajectories. As one might expect, one 
can see a high correlation between the success rate for 
the state variable estimation (Tab. la) and the success 
rate for the parameter estimation (Tab. Ib). The success 
rates depend not only on the combination of observed 
variables only, but also on the trajectory {*z(n)} used to 
generate the time series (i.e. the starting points on the 
attractor). 

In Tab. II the success rate of the error defined by 
Eq. (30) is shown. Compared to Eq. (22) this success rate 
can be computed from the data and the estimated model 
state variables only. One can see a high correlation be¬ 
tween Tab. la and Tab. II indicating that is a good 
approximation of (at least in the absence of errors in 
the model equations, as in these simulations). There are, 
however, some discrepancies. For example, if {®z(n)} is 
the true trajectory and c = 1 — 2 is measured. Tab. la 
shows a much smaller success rate, given by Eq. (22) 
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Figure 2. This figure show two examples where the Lorenz-96 model Eq. (3) was adapted to a (multivariate) time series rjj. The 
(unmeasured) model state variables Xj and the fixed model parameter were estimated using the estimation method 1 described 
in Sec. IV. The output of the measurement function is yj and the true trajectory is Zj (unknown to the estimation algorithm). 
The estimation was performed for 0 < t < 15 and the prediction of the model variables for 15 < t < 18 (right of the vertical 
black dashed line at t = 15). Left column, (a), (c), (e), (g): xi and X 2 are measured (c = (1 — 2)) and i = 3, h = 385. The 
estimation error of model variables is larger than the prediction error, = 1.73 > = 0.054. Right column, 

(b), (d), (f), (h): xi and *3 are measured (c = (1 — 3)) and i = 1, /i = 140. The estimation error of model variables is smaller 
than the prediction error, = 6.1 • 10“^ < i 4 oPE^^“®^ = 1.98. 


(only the error of all model variables is considered), com¬ 
pared to the success rate, given by Eq. (30), in Tab. II 
(the error of measured state variables is considered only). 
This shows that a good estimation of measured variables 
does not necessarily mean that unmeasured variables are 
also estimated correctly. 

With initial guess method 1 the initial guess for each 
of the 9 model variable at 1500 locations along the (ini¬ 
tial) trajectory is a random number (equally distributed) 
from the interval [—9,14] (see Sec. VB). With the guess 
for the unknown parameter the full initial guess is a point 
in a Higuess = 13501 dimensional box. Scanning this en¬ 
tire Higuess = 13501 dimensional rectangular box con¬ 
taining the initial is not an appropriate method to learn 
something about the basin shape of the optimal solu¬ 
tion. Nevertheless, one can interpret the success rate 
as the ratio of the size of the basin of successful es¬ 
timates and the volume of the box in the space^® 

in percentage. Using the initial guess method 2 (exact 
solutions of the model) in VB one can create a similar 
statistic (not shown here). We found that the success 
rate for the state variables and parameter estimation is 
almost zero in many if not most cases for all combinations 
of observed state variables and all true trajectories, i.e. 
method 2 gives worse success rates compared to method 
1 (uniformly distributed samples in a box). 

As discussed in Sec. VB using initial guess method 


3 (samples close to the true solution) is usually not ap¬ 
plicable in a real world estimation process, because the 
true trajectories are usually not given. We use it here 
to estimate the basin size around the true trajectories 
and it turns out that initial guesses uniformly sampled 
in a “tube” around the true trajectories with a radius 
of 15 (which is larger than the amplitude of the oscilla¬ 
tions) provide correct estimates with a very high success 
rate. This means that the optimal solution is not only 
locally observable but possesses a basin of considerable 
size. However, this basin is bent/curved in a very high 
dimensional space. 


2 . Prediction Error 

In contrast to considering and only, we 

also consider the success rate of the prediction, 

Eq. (33). For initial guess method 1 the prediction suc¬ 
cess rate (*PE'^) is shown in Tab. III. Remember, that 
jjPE'^ can be computed using the solution from the es¬ 
timation process and the measured data {*J7°(n)} only, 
provided data for N < n < N + Np^ed are available. The 
prediction was computed for iVpred = 300 time steps. 
Here, due to noise in the data, an estimate of the model 
state variables is considered as successful if j^PE^ < 0.5. 
Note that “successful”’ here does not necessarily mean 
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(a) Observability of model state variables 
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50.4 

81.8 

84.6 

68.4 

53 

92.6 

{ilz(n)} 

0.6 

1.2 

1.8 

0.6 

14.6 

4.2 

63.8 

90.4 

23 

61.2 

63.4 

59.2 

57.8 

82 

87.6 

{I2z(n)} 

0 

53.8 

1.4 

5.8 

1.2 

88.8 

95.4 

89 

2.4 

4.2 

14.8 

45.8 

3.8 

0.8 

88 

{I3z(n)} 

0.2 

6.6 

27.8 

0.4 

23.2 

82.8 

96.4 

84 

60.8 

93 

86 

67.2 

75.4 

66.8 

86.2 

{I4z(n)} 

1.4 

31.4 

11.6 

2.8 

0.8 

93.8 

97 

83.4 

83.8 

6.8 

83 

57.6 

68.2 

42.8 

89.4 

{I5z(n)} 

0.4 

59 

7 

1.2 

1.2 

98.8 

97 

89.8 

44.6 

86.2 

89 

67 

36.8 

58.2 

82.6 

{I6z(n)} 

0.8 

15 

14.4 

10.4 

0.8 

95.6 

96.6 

94 

59.4 

83.2 

97.4 

36 

9.8 

85.6 

9.8 

{l'^z(n)} 

0.4 

87.8 

2 

0.6 

0.4 

84.4 

92.4 

95 

92.4 

44 

30 

16.8 

3.4 

43.8 

87.8 

00 

0 

39.6 

2.4 

2 

1.6 

88.4 

96.6 

86 

86.2 

92.4 

32.8 

16.6 

23.2 

72 

86.8 


(b) Observability of the fixed model parameter 


Table I. These tables show the results of the simulation explained in Sec. V A with initial guess method Sec. V B method 
1. For the 9 dimensional Lorenz-96 model Eq. (3) there exist 15 mathematically different combinations of one to three state 
variables constituting a multivariate time series (Sec. V A, the first rows of tables (a) and (b) show all these combinations). 
Example: 1-2-4 means that the variables Zi, Z2 and za are measured. The 18 noise-free time series {*z(n)}, i = 1,..., 18 are 
generated by integrating the model equations with different initial conditions. For each i, from {*z(n)} we extract 15 different 
time series with different combinations of state variables. According to Eq. (20), some artificial noise is added (Sec. V A). This 
results in 15 • 18 = 270 different noisy multivariate time series (cf. Eq. (20)). To each of the 270 noisy time series the Lorenz-96 
model is adapted Aiguess = 500 times using the state and parameter estimation algorithm described in Sec. IV with 500 
different initial guesses for the model state variables and the fixed model parameter chosen according to initial guess method 1 
(uniformly distributed samples in a box) (Sec. VB). For each of the 500 solutions (Eq. (21)) is computed (h = 1,... ,500). 
If < 10“^, then the variables estimation is considered as successful. The values in the tables show the percentages of 
successful estimations of (a) state variables, Eq. (22), and (b) parameters, Eq. (31). 


that the prediction of unobserved state variables is accu¬ 
rate nor that in the estimation window n € [0,..., A] the 
observed and unobserved model variables and the model 
parameter are estimated correctly (in the sense that 
is small and G [8.16,8.18]). One can see that even for 


two measured variables there are many combinations of 
{®z(n)} and the measured variables with a large 
showing successful predictions of observed variables. Fur¬ 
thermore, when only a single variable is measured the 
predictions fail for almost all true trajectories as shown 
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True Observed state variables c 


trajectory 

1 

1-2 

1-3 

1-4 

1-5 

1-2-3 

1-2-4 

1-2-5 

1-2-6 

1-2-7 

1-2-8 

1-3-5 

1-3-6 

1-3-7 

1-4-7 

{iz{n)} 

0 

49.8 

33 

42.6 

0 

95.6 

98.8 

86.8 

95.8 

83.6 

33 

50.2 

85.4 

90.2 

89.8 

iMn)} 

0 

3.6 

0 

1.6 

0 

2.6 

89.8 

90.2 

18.4 

65 

17.4 

8.6 

5.2 

15.6 

64.8 

{3z(n)} 

0 

1.8 

5.4 

0 

0.2 

66.2 

82.2 

35 

2 

39.6 

3.4 

17.6 

76.4 

87.8 

83.4 

{4z{n)} 

0 

1 

0.4 

0.2 

0 

77.8 

94.2 

90.2 

2 

87.8 

34.6 

34.8 

31 

28.8 

18.2 

{^z{n)} 

0 

67 

60.6 

0.6 

5.2 

74.8 

95 

90 

83.2 

93 

41 

86.6 

84.6 

88.2 

89.8 

{®z(n)} 

0 

37.4 

0.2 

0 

0.2 

58.2 

77.6 

80.8 

94.6 

56.2 

78.8 

7.8 

27.8 

14.4 

25.4 

Cz{n)} 

0 

2.8 

2.2 

6.4 

0 

96.4 

93.6 

76.4 

91.4 

30 

30.2 

37.8 

55.2 

36.4 

82.2 

00 

0 

95.2 

5.2 

0.8 

1 

97.2 

98.2 

92.4 

85.6 

86.6 

94.4 

83.2 

65.2 

59.8 

89.2 

{9z{n)} 

0.2 

87.8 

80.2 

76 

0.2 

96.8 

99.4 

87.4 

89 

87.8 

12.4 

89.4 

83.4 

77.8 

89.2 

{lOz(n)} 

0 

92.4 

28.4 

1.2 

2.8 

98.2 

98.4 

94.8 

85.4 

50.4 

81.8 

84.4 

68.4 

6.2 

4.2 

{ilz(n)} 

0 

0 

0.4 

0.6 

5.2 

4 

63.8 

89.2 

1.6 

60.8 

61.8 

58.8 

49.4 

81 

87.2 

{I2z(n)} 

0 

24.8 

1.2 

1.6 

0 

88.6 

95.4 

89 

1.6 

3.6 

12.2 

20.8 

2.6 

0.6 

86.4 

{I3z(n)} 

0 

4.8 

27.2 

0.2 

3.6 

82.8 

95.6 

81 

7.2 

92.6 

86 

67.2 

67.8 

66.4 

86 

{I4z(n)} 

0 

30.6 

7.8 

1.4 

0 

90.6 

95.8 

78 

83.8 

6.6 

79 

56.2 

68 

42.6 

88.6 

{I5z(n)} 

0 

47.2 

1 

0.2 

0 

85 

96.4 

89.4 

44.6 

85.8 

87.4 

66.8 

36.4 

58.2 

82.2 

{I6z(n)} 

0 

14.4 

14.2 

0 

0 

95.4 

96 

94 

59.4 

76.6 

96.6 

27.6 

9 

15.6 

7.2 

{i'^z(n)} 

0 

78.4 

1.4 

0.4 

0.2 

83.6 

92.4 

95 

92.4 

5.4 

28.4 

16.6 

3.4 

1.2 

87.8 

00 

0 

37.6 

2.2 

1.6 

0 

87.6 

96.6 

86 

86.2 

92.4 

32.8 

16.4 

22.8 

71.8 

86.2 


Table II. Similar to Tab. la, except that the values in the tables show the success rate Eq. (30) which only depends on the 
estimated model state variables and the data. 


True Observed state variables c 


trajectory 

1 

1-2 

1-3 

1-4 

1-5 

1-2-3 

1-2-4 

1-2-5 

1-2-6 

1-2-7 

1-2-8 

1-3-5 

1-3-6 

1-3-7 

1-4-7 

Ez(n)} 

0 

0 

0.2 

50.4 

0 

0.2 

0 

90 

1.2 

84.2 

0 

52.8 

88.8 

91.6 

90.4 

{2z{n)} 

0 

10.4 

1.2 

23.4 

3 

3.6 

91.6 

0.4 

19 

74.8 

66.8 

86 

59 

19.6 

71.2 

{3z(n)} 

0.2 

37.6 

22.4 

4.6 

28 

66.6 

95 

89.8 

71.6 

98.8 

31.2 

21 

87.2 

95.4 

95.6 

{^z{n)} 

0 

1.8 

0.6 

8.4 

11.6 

78.2 

95.2 

93.8 

27.8 

96.4 

85 

66 

79.4 

31.2 

94.8 

{5z{n)} 

0 

67.2 

63.4 

0.8 

5.4 

92.2 

96.6 

90 

84.2 

93.4 

0.6 

92.4 

90 

92 

92.8 

{6z(n)} 

0 

37.4 

0.6 

1.2 

4.8 

60.2 

78 

82.2 

98.6 

57.8 

86 

16.8 

29.8 

38.6 

26.4 

Cz{n)} 

0 

0 

2.2 

7.4 

0.6 

1 

94.2 

4.2 

1 

37.6 

30.4 

41.2 

57 

36.6 

82.6 

00 

0 

0 

0 

2.2 

3 

0.6 

0.6 

1.4 

0 

86.8 

0 

86.4 

0 

61.8 

96.2 

{9z{n)} 

0.2 

94 

83.4 

82 

1.6 

0 

99.4 

87.4 

91.4 

98 

88.4 

91.4 

88.8 

93.8 

96.4 

{lOz(n)} 

0 

0 

58.8 

1.2 

0 

98.2 

98.4 

0 

1.2 

0 

0 

0 

0.8 

52.4 

92.4 

{llz(n)} 

0 

3.6 

2 

2.2 

18.8 

5.4 

67.2 

91.2 

23.2 

61.6 

64 

85.6 

51.8 

86.2 

91.6 

{I2z(n)} 

0 

55.4 

36 

7.2 

0.4 

90.4 

96.4 

92.4 

20.8 

5.4 

25.8 

61 

80.6 

15.8 

94.4 

{I3z(n)} 

0 

0.2 

38.2 

3 

28.6 

83 

1 

1.2 

0.8 

0 

0 

68.6 

77.8 

78.4 

2.8 

{I4z(n)} 

0 

31.4 

1.4 

4.4 

0.2 

0 

97.4 

1.8 

83.8 

7.4 

84.4 

5.6 

68.2 

94.8 

93.4 

{I5z(n)} 

0.2 

70.6 

29.2 

11.4 

4.4 

99.8 

99.2 

98.8 

97.8 

99.8 

96.4 

72.6 

57.6 

92.2 

94 

{I6z(n)} 

0 

0.2 

0 

0 

0 

95.4 

0 

0 

0.2 

0 

0 

2 

15.6 

0 

0.6 

{l'^z(n)} 

0 

78.4 

2 

10 

0.4 

0 

0 

0.2 

1.4 

0 

0 

50.2 

3.4 

43.8 

2.8 

00 

0 

43.2 

17.4 

46.4 

11.2 

91 

97.2 

97.8 

94 

98.4 

92.2 

58.6 

83.6 

92.8 

92 


Table III. This table show the statistic of the prediction error for initial guess method 1. The table the table has to be interpreted 
in the same way as Tab. la. In contrast to Tab. la the numbers show Eq. (33), which is the percentage of successful 

predictions by considering the prediction error ),PE“, Eq. (32). An estimation is considered as successful if )jPE° < 0.5. The 
length of the prediction window is A^pred = 300 and time steps of length At = are used. 


by « 0%. These results are consistent with 

the results obtained from Tab. la, although on average 
the percentages have smaller numerical values. Never¬ 
theless, there are cases where is large and 

is small (example: c = (1 — 2), i = 3) and vice versa 
(example: c = (1 — 3), t = 1). For both cases estima¬ 
tion and prediction examples are shown in Fig. 2 left col¬ 
umn > gggPE^^”^^) and Fig. 2 right column 

< } 4 qPE^^“^^). This means that the correla¬ 
tion between and (“PE'^) is strong but not perfect. 

A good prediction does not necessarily mean a good esti¬ 
mation during the estimation window. It rather indicates 
that if the prediction error is small then the estimation 


of unobserved state variables and parameters is good. 

Using the initial guess method 2 (exact solutions of the 
model), we found that the success rate (“PE*^) is almost 
zero in many if not most cases for all combinations of 
observed variables and all true trajectories, i.e. method 
2 gives worse success rates compared to method 1 (uni¬ 
formly distributed samples in a box). The same was ob¬ 
served when considering A possible explanation 

for this observation is the fact that, for exact solutions, 
the term Ci (Eq. (11)) in the cost function C is the only 
term significantly different from zero, such that the ini¬ 
tial values result in a relatively small value of the total 
cost function and this may increase the probability to be 
close to (and kept in) a local minimum. 
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When initial guess method 3 (samples close to the true 
solution) was used, we observed that most success rates 
of the prediction are close to 100%. Neverthe¬ 

less, there are also combinations of a true trajectory and 
measured state variables with a success rate close to zero 
(especially for one and two measured variables) although 
corresponding success rates are high. The most 

likely reason is the chaotic dynamics of the model and 
therefore the fast divergence from the data when com¬ 
puting the predictions. 


VI. DISCUSSION AND CONCLUSION 

Using a chaotic 9-dimensional Lorenz-96 model as a 
prototypical example we studied observability of all its 9 
state variables Xi and the fixed model parameter p us¬ 
ing different multivariate time series consisting of one to 
three observables. Local observability was characterized 
by a recently introduced measure of uncertainty Vi given 
in Eq. (7). This analysis indicates that all state vari¬ 
ables and the parameter can be reconstructed, even in 
cases where only a univariate time system is available. 
It turned out that on average the values of Vi for un¬ 
measured state variables are minimal for a delay time r 
between r = 0.11 and r = 0.21 (see Fig. 1). This is in 
agreement with results reported in Ref. ’ where r « 0.1 
was found to be an appropriate delay time to synchro¬ 
nize a Lorenz-96 model to an observed time series using a 
delay coordinates based coupling scheme. Histograms of 
the uncertainties iZj of the fixed model parameter and the 
measured and unmeasured state variables look similar, 
independent of the number of measured variables. This 
means that the successful reconstruction of the state x(t) 
and the parameter p should not depend on the number of 
measured state variables in a (multivariate) time series, 
provided one initializes the estimation algorithm close 
enough to the true solution (note that the observability 
analysis presented in Sec. Ill is only locally valid). 

In Ref.® we showed that for the Lorenz-96 model syn¬ 
chronization to the data is indeed possible with only a 
single measured state variable, only, using a synchroniza¬ 
tion scheme based on delay vectors of the data time se¬ 
ries. Hence this result is in coincidence with the fact that 
the uncertainty values Vi are relatively small already for 
univariate time series from the Lorenz-96 system. 

Furthermore, we addressed the question whether the 
estimation of the model states is also possible if an es¬ 
timation algorithm is initialized further away from the 
true trajectory of the dynamical system underlying the 
data. To probe this global convergence a statistical test 
was performed where an optimization based state and pa¬ 
rameter estimation algorithm* was initialized with differ¬ 
ent initial guesses for the entire trajectory and the model 
parameter. Three different methods for generating the 
initial guesses were used (see Sec. VB) and compared. 

With method 1 initial guesses were chosen uniformly 
distributed in a box. With this preparation of initial 


guesses of the optimization algorithm state and parame¬ 
ter estimation in the 9-dimensional Lorenz-96 model was 
possible with a very high success rate if multivariate 
times series with (at least) three observables are avail¬ 
able, while for two measured state variables, only a frac¬ 
tion of estimation runs was successful (see success rates 
summarized in Tabs. la and Ib). Note, that the initial 
guesses generated by method 1 are typically far off the 
trajectory underlying the data. As a consequence state 
and parameter estimation based on univariate time series 
failed in most cases. Therefore, for practical application 
local observability is a necessary but not a sufficient fea¬ 
ture of the given estimation problem. 

Furthermore, it was shown that an error definition 
based on the difference between the estimated solution of 
the model variables and the noise free true trajectory (of 
all variables), Eq. (21), gives comparable success rates as 
an error definition based on the difference between the 
measurement function and the data, Eq. (23). For the 
latter, a lower boundary was derived which is valid for a 
smooth solution. Note, that in all simulations the model 
equations have no errors. The question of whether both 
error definitions would give comparable results if errors 
in the model equations are present, was not addressed. 

Using exact trajectories (not coinciding with the true 
trajectory underlying the data) as initial conditions 
(method 2) turned out to result in very poor estima¬ 
tion results. Hence, initializing the estimation algorithm 
with an arbitrary solution of the model equations is a 
disadvantage compared to random initial guesses. 

High success rates (close to 100%) were obtained using 
initial guess method 3 where the estimation algorithm is 
initialized with samples close to true solutions. These 
results are consistent with the low uncertainty observed 
in the local observability analysis. Note, however, that 
usually this initialization method can not be applied with 
real world data, because the true trajectories used to gen¬ 
erated the initial guess are typically unknown. 

In addition to considering the success rate of the es¬ 
timation, Eq. (22), which can only be computed if the 
clean trajectories of all state variables are known (often 
only one variable can be measured), the success rate of 
prediction, Eq. (33), was considered. This prediction er¬ 
ror is more suitable for real world applications, because 
it can be computed based on measured data and the esti¬ 
mated model state variables and does not require further 
information about the dynamics. In the example con¬ 
sidered here a correlation between the prediction error 
and the success rate of the estimation was observed in¬ 
dicating that the prediction error is a good measure for 
the success of the estimation procedure. Nevertheless, it 
was also shown that a small estimation error does not 
necessarily mean a small prediction error and vice versa. 

Our results indicate that successful state and parame¬ 
ter estimation crucially depends on the selection of avail¬ 
able observables (univariate vs. multivariate), on the tra¬ 
jectory segment underlying the time series (i.e., the re¬ 
gion of the state space the trajectory visits during mea- 
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surements), and last but not least, the initialization of 
the estimation algorithm. The first two aspects are typi¬ 
cally determined by and during the measurement (or ex¬ 
periment) and cannot me changed afterwards. Only the 
choice of the estimation method and of its initialization 
is (typically) in the hand of the person who is analyzing 
the data. Using a representative algorithm from the class 
of optimization based methods (similar to 4D-VAR) we 
demonstrated that the success may crucially depend on 
a proper choice of initial guesses. Finding suitable crite¬ 
ria and initialization strategies is thus an important open 
task for future research on state and parameter estima¬ 
tion algorithms. 
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APPENDIX: JACOBIAN MATRIX OF THE DELAY 
COORDINATES MAP 


The Jacobian of the delay reconstruction map Eq. (6) 
with respect to x(<) and p is given by 


/ Dxh(x(t)) 

Dxh((()^(x(t),p)) • Dx<()^(x(t),p) 
Dxh(^!)2^(x(t),p)) • Dx<()^’'(x(t),p) 


0 \ 

Dxh((;i'"(x(t),p))) • Dp0^(x(t),p) 
Dxh((/)2-"(x(t),p)) 


\Dxh((()(^ i)^(x(t),p)) • 0x0^^ i)’'(x(t),p) Dxh((^(^ i)'^(x(t),p)) • i)^(x(t),p)/ 


where 


Dxh((/)^'(x(t),p)) 


Dx(/)^'(x(t),p) 


Dp(/)^'(x(t),p) 


/ 9^1 
f dxi 

dhi \ 
dxM \ 

[ dhg 

V dxi 

dhn j 
dxM ' 

( 

dx\ 

dFi \ 
dxM 

94>n 

\ dxi 

d<Pn 
dxM / 

■■ 

dpp 

94‘n 

\ dpi 

dpp / 


(36) 


with t' = 0, T, 2r,..., {K — 1)t. To compute the Jaco¬ 
bian matrix DSx.p(x(t), p) (35) of the delay coordinates 
map S(x(t),p) we have to compute the Jacobians (36) 
where (x(^))P) a-nd (x(t),p) contain deriva¬ 

tives of the flow (jf generated by the dynamical system 
(1) with respect to state variables Xj and parameters pj, 
respectively. The D x U-matrix Dx(/>’’ (x(t),p) can be 
computed by solving the linearized dynamical equations 
in terms of a matrix ODE 



DxF {(l)'^{x{t),p),p) • Y(r) 


(37) 


where (x(t), p) is a solution of Eq. (1) with initial value 
x(t) and Y(t) is an Zl x D matrix that is initialized as 
Y(r = 0) = Id, where Id denotes the D x D identity 
matrix. Similarly, the D x P-matrix Dp^'’’(x(t),p) is 
obtained as a solution of the matrix ODE'^^ 

^Z(t) =DxF ((;i’'(x(t), p), p) • Z(r) 

+ DpF(^^(x(t),p),p) (38) 

with Z(r = 0) = 0. DxF(...) and DpF(...) denote 
the Jacobians containing derivatives dFi{...)/dXj and 
dFi{.. .)/dpj, respectively. Solving (37) and (38) simul¬ 
taneously with the system ODEs (1) we can compute 


Dx<?ii'"(x(t),p) =Y(t) 

(39) 

Dx(/i^’'(x(t),p) =Y(2 t) 

(40) 

Dp<?i'’'(x(f),p) =Z(r) 

(41) 

Dp()|2’'(x(t),p) =Z(2r) 

(42) 


and use these matrices to obtain the Jacobian matrix 
DxpS(x(t),p) Eq. (35) of the delay coordinates map S 
Eq! (6). 
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